

Interpretable Robot Control via Structured Behavior Trees and Large Language Models

Chekam, Ingrid Maéva, Pastor-Martinez, Ines, Tourani, Ali, Millan-Romera, Jose Andres, Ribeiro, Laura, Soares, Pedro Miguel Bastos, Voos, Holger, Sanchez-Lopez, Jose Luis

arXiv.org Artificial Intelligence

With the increasing presence of intelligent robots in everyday life, the demand for reliable and straightforward Human-Robot Interaction (HRI) interfaces is rapidly rising. Traditional robot control paradigms require users to learn particular commands [1] or interact with the robots through rigid user interfaces, especially in unstructured environments [2]. However, recent works target more flexible and adaptive communication strategies, unlocking the full potential of autonomous agents in human-centered environments. Accordingly, advances in generative AI and Large Language Models (LLMs) reveal new opportunities for enabling seamless communication between humans and robots, where natural language is the primary means of communication [3]. Such models are powerful enough to comprehend given instructions and even "reason" about the demanded tasks, intentions, and environmental context [4]. When paired with robotic perception and control systems, LLMs enable users to intuitively instruct the robot to perform complex tasks such as following multiple objects [5], navigating through dynamic scenes [6], or interacting with specific items [7], all using natural dialogue. Furthermore, integrating multimodal capabilities, including vision and speech, enhances HRI by enabling more natural, context-aware communication and improving adaptability across tasks and environments [8].
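The pipeline described above — an LLM translating a natural-language instruction into an interpretable, structured robot behavior — can be sketched minimally as follows. This is an illustrative stand-in, not the paper's implementation: the JSON schema, the `Action`/`Sequence` node names, and the stubbed skill dispatch are all assumptions.

```python
import json

class Action:
    """Leaf node: stands in for a named robot primitive (stubbed here)."""
    def __init__(self, name, **params):
        self.name, self.params = name, params
    def tick(self):
        # A real system would dispatch to the robot's skill library.
        return "SUCCESS"

class Sequence:
    """Composite node: succeeds only if all children succeed, in order."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() != "SUCCESS":
                return "FAILURE"
        return "SUCCESS"

def tree_from_llm(payload: str):
    """Build a behavior tree from a (hypothetical) LLM JSON response."""
    spec = json.loads(payload)
    return Sequence([Action(step["action"], **step.get("args", {}))
                     for step in spec["steps"]])

# Example payload an instruction-following LLM might be prompted to emit
# for "bring the cup to the table":
llm_output = ('{"steps": [{"action": "locate", "args": {"object": "cup"}},'
              ' {"action": "grasp", "args": {"object": "cup"}},'
              ' {"action": "navigate", "args": {"goal": "table"}},'
              ' {"action": "release"}]}')
tree = tree_from_llm(llm_output)
```

Constraining the LLM to emit a machine-checkable structure like this, rather than free-form text, is what keeps the resulting robot behavior inspectable and repeatable.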


Deployment and Development of a Cognitive Teleoreactive Framework for Deep Sea Autonomy

Thierauf, Christopher

arXiv.org Artificial Intelligence

Abstract--A new AUV mission planning and execution software package has been tested on the AUV Sentry. Dubbed DINOS-R, it draws inspiration from cognitive architectures and AUV control systems to replace the legacy MC (Mission Controller) architecture. Unlike these existing architectures, however, DINOS-R is built from the ground up to unify symbolic decision making (for understandable, repeatable, provable behavior) with machine learning techniques and reactive behaviors, for field-readiness across oceanographic platforms. Implemented primarily in Python3, DINOS-R is extensible, modular, and reusable, with an emphasis on non-expert use as well as growth for future research in oceanography and robot algorithms. Missions can be specified flexibly and declaratively. Behavior specification is similarly flexible, supporting simultaneous use of real-time task planning and hard-coded, user-specified plans. These features were demonstrated in the field on Sentry, in addition to a variety of simulated cases. These results are discussed, and future work is outlined. In particular, although the MC system in use on Sentry has repeatedly proven itself for lawnmower patterns, it presents several key limitations stemming from its rigid implementation. Most notably, it can execute basic "go-to" commands and similar functionality, but was not engineered for scalability to new mission modalities or real-time interventions.
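The declarative mission style the abstract describes — naming goals rather than procedures, with the executor free to choose between a planner and a hard-coded user plan — might look like the following sketch. The mission fields, the `HARD_CODED` override table, and the planner stub are illustrative assumptions, not DINOS-R's actual API.

```python
# Hypothetical declarative mission: each entry names a goal, not how to
# achieve it; the executor decides how each goal is satisfied.
MISSION = [
    {"goal": "descend", "params": {"depth_m": 1500}},
    {"goal": "survey", "params": {"pattern": "lawnmower", "spacing_m": 50}},
    {"goal": "surface", "params": {}},
]

# User-specified, hard-coded plans that take precedence over the planner.
HARD_CODED = {"surface": lambda params: "ascend_profile_standard"}

def plan_for(goal, params):
    """Prefer a user-specified plan; otherwise fall back to a planner stub."""
    if goal in HARD_CODED:
        return HARD_CODED[goal](params)
    return f"planned:{goal}"  # stand-in for a real task planner

def execute(mission):
    """Resolve each declared goal into a concrete plan, in mission order."""
    return [plan_for(step["goal"], step["params"]) for step in mission]
```

Allowing both resolution paths side by side is what lets real-time planning and operator-authored plans coexist in one mission, as the abstract claims.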


ERR@HRI 2.0 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Conversations

Cao, Shiye, Stiber, Maia, Mahmood, Amama, Parreira, Maria Teresa, Ju, Wendy, Spitale, Micol, Gunes, Hatice, Huang, Chien-Ming

arXiv.org Artificial Intelligence

The integration of large language models (LLMs) into conversational robots has made human-robot conversations more dynamic. Yet, LLM-powered conversational robots remain prone to errors, e.g., misunderstanding user intent, prematurely interrupting users, or failing to respond altogether. Detecting and addressing these failures is critical for preventing conversational breakdowns, avoiding task disruptions, and sustaining user trust. To tackle this problem, the ERR@HRI 2.0 Challenge provides a multimodal dataset of LLM-powered conversational robot failures during human-robot conversations and encourages researchers to benchmark machine learning models designed to detect robot failures. The dataset includes 16 hours of dyadic human-robot interactions, incorporating facial, speech, and head movement features. Each interaction is annotated with the presence or absence of robot errors from the system perspective, as well as with the perceived user intention to correct the robot, which signals a mismatch between robot behavior and user expectations. Participants are invited to form teams and develop machine learning models that detect these failures using multimodal data. Submissions will be evaluated using various performance metrics, including detection accuracy and false positive rate. This challenge represents another key step toward improving failure detection in human-robot interaction through social signal analysis.
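The two evaluation metrics named in the abstract, detection accuracy and false positive rate, are straightforward to compute from binary per-interaction labels. A minimal sketch (the label encoding, 1 = robot error present, is an assumption; the challenge's official scorer may differ):

```python
def detection_metrics(y_true, y_pred):
    """Accuracy and false positive rate for binary failure labels
    (assumed encoding: 1 = robot error present, 0 = no error)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # FPR: fraction of error-free interactions falsely flagged as failures.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr
```

Reporting both matters here: a detector that flags every turn as a failure scores well on recall-like measures but would interrupt error-free conversations, which the false positive rate penalizes directly.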


Multifunctional physical reservoir computing in soft tensegrity robots

Terajima, Ryo, Inoue, Katsuma, Nakajima, Kohei, Kuniyoshi, Yasuo

arXiv.org Artificial Intelligence

Recent studies have demonstrated that the dynamics of physical systems can be utilized for the desired information processing under the framework of physical reservoir computing (PRC). Robots with soft bodies are examples of such physical systems, and their nonlinear body-environment dynamics can be used to compute and generate the motor signals necessary for the control of their own behavior. In this simulation study, we extend this approach to control and embed not only one but also multiple behaviors into a type of soft robot called a tensegrity robot. The resulting system, consisting of the robot and the environment, is a multistable dynamical system that converges to different attractors from varying initial conditions. Furthermore, attractor analysis reveals that there exist "untrained attractors" in the state space of the system outside the training data. These untrained attractors reflect the intrinsic properties and structures of the tensegrity robot and its interactions with the environment. The impacts of these recent findings in PRC remain unexplored in embodied AI research. We here illustrate their potential to understand various features of embodied cognition that have not been fully addressed to date.


Designing for Difference: How Human Characteristics Shape Perceptions of Collaborative Robots

Livanec, Sabrina, Londoño, Laura, Gorki, Michael, Röfer, Adrian, Valada, Abhinav, Kiesel, Andrea

arXiv.org Artificial Intelligence

The development of assistive robots for social collaboration raises critical questions about responsible and inclusive design, especially when interacting with individuals from protected groups such as those with disabilities or advanced age. Currently, research is scarce on how participants assess varying robot behaviors in combination with diverse human needs, likely since participants have limited real-world experience with advanced domestic robots. In the current study, we aim to address this gap while using methods that enable participants to assess robot behavior, as well as methods that support meaningful reflection despite limited experience. In an online study, 112 participants (from both experimental and control groups) evaluated 7 videos from a total of 28 variations of human-robot collaboration types. The experimental group first completed a cognitive-affective mapping (CAM) exercise on human-robot collaboration before providing their ratings. Although CAM reflection did not significantly affect overall ratings, it led to more pronounced assessments for certain combinations of robot behavior and human condition. Most importantly, the type of human-robot collaboration influenced the assessments. Antisocial robot behavior was consistently rated as the lowest, while collaboration with aged individuals elicited more sensitive evaluations. Scenarios involving object handovers were viewed more positively than those without them. These findings suggest that both human characteristics and interaction paradigms influence the perceived acceptability of collaborative robots, underscoring the importance of prosocial design. They also highlight the potential of reflective methods, such as CAM, to elicit nuanced feedback, supporting the development of user-centered and socially responsible robotic systems tailored to diverse populations.


Gear News of the Week: Samsung's Trifold Promise, Ikea's Sonos Split, and Hugging Face's New Robot

WIRED

Samsung's Galaxy Unpacked event in Brooklyn earlier this week debuted seven new devices, from the Galaxy Z Fold7 to the Galaxy Watch8 series. But there weren't any surprises at the end, despite rumors that Samsung would unveil a trifold phone. Sensing disappointment, the company later confirmed that the phone is expected to land in 2025. "I expect we will be able to launch the trifold phone within this year," TM Roh, head of Samsung's mobile business, told The Korea Times. The trifold phone, rumored to be called the Galaxy G Fold, would have a normal screen on the front and two hinges that let you open it up as a tablet-sized screen.


Effective Explanations for Belief-Desire-Intention Robots: When and What to Explain

Wang, Cong, Calandra, Roberto, Klös, Verena

arXiv.org Artificial Intelligence

When robots perform complex and context-dependent tasks in our daily lives, deviations from expectations can confuse users. Explanations of the robot's reasoning process can help users to understand the robot intentions. However, when to provide explanations and what they contain are important to avoid user annoyance. We have investigated user preferences for explanation demand and content for a robot that helps with daily cleaning tasks in a kitchen. Our results show that users want explanations in surprising situations and prefer concise explanations that clearly state the intention behind the confusing action and the contextual factors that were relevant to this decision. Based on these findings, we propose two algorithms to identify surprising actions and to construct effective explanations for Belief-Desire-Intention (BDI) robots. Our algorithms can be easily integrated in the BDI reasoning process and pave the way for better human-robot interaction with context- and user-specific explanations.
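The two proposed algorithms — flagging surprising actions and assembling concise explanations from the intention plus the relevant context — can be sketched as follows. The expectation-model dictionary, the threshold value, and the kitchen examples are illustrative assumptions, not the paper's algorithms.

```python
def is_surprising(action, expectations, threshold=0.2):
    """An action is surprising if the user's expectation model assigns it
    low probability in the current context (threshold is illustrative)."""
    return expectations.get(action, 0.0) < threshold

def explain(action, intention, context_factors):
    """Concise explanation: the intention behind the action plus only the
    contextual factors that actually drove the decision."""
    factors = " and ".join(context_factors)
    return f"I {action} because I want to {intention}, since {factors}."

# Hypothetical expectation model for a kitchen-cleaning robot.
expectations = {"wipe counter": 0.7, "move knife to drawer": 0.05}
msg = ""
if is_surprising("move knife to drawer", expectations):
    msg = explain("moved the knife to the drawer",
                  "keep the workspace safe",
                  ["a child entered the kitchen"])
```

This mirrors the study's finding: explanations are only offered when expectations are violated, and they name the intention and the triggering context rather than the full reasoning trace.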


Rude Humans and Vengeful Robots: Examining Human Perceptions of Robot Retaliatory Intentions in Professional Settings

Letheren, Kate, Robinson, Nicole

arXiv.org Artificial Intelligence

Humans and robots are increasingly working in personal and professional settings. In workplace settings, humans and robots may work together as colleagues, potentially leading to social expectations, or violation thereof. Extant research has primarily sought to understand social interactions and expectations in personal rather than professional settings, and none of these studies have examined negative outcomes arising from violations of social expectations. This paper reports the results of a 2x3 online experiment that used a unique first-person perspective video to immerse participants in a collaborative workplace setting. The results are nuanced and reveal that while robots are expected to act in accordance with social expectations despite human behavior, there are benefits for robots perceived as being the bigger person in the face of human rudeness. Theoretical and practical implications are provided which discuss the import of these findings for the design of social robots.


On the Fly Adaptation of Behavior Tree-Based Policies through Reinforcement Learning

Iannotta, Marco, Stork, Johannes A., Schaffernicht, Erik, Stoyanov, Todor

arXiv.org Artificial Intelligence

With the rising demand for flexible manufacturing, robots are increasingly expected to operate in dynamic environments where local disturbances--such as slight offsets or size differences in workpieces--are common. We propose to address the problem of adapting robot behaviors to these task variations with a sample-efficient hierarchical reinforcement learning approach adapting Behavior Tree (BT)-based policies. We maintain the core BT properties as an interpretable, modular framework for structuring reactive behaviors, but extend their use beyond static tasks by inherently accommodating local task variations. To show the efficiency and effectiveness of our approach, we conduct experiments both in simulation and on a 7-DoF Franka Emika Panda manipulator, adapting to different obstacle avoidance and pivoting tasks.
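One way to picture adapting a BT-based policy to local disturbances is to expose a leaf's parameter (say, a goal offset) and let a learning loop tune it from a success signal. The sketch below uses simple hill climbing as a stand-in for the paper's hierarchical RL; the `MoveTo` leaf, the tolerance, and the update rule are all illustrative assumptions.

```python
import random

random.seed(0)

class MoveTo:
    """BT leaf whose goal offset is a learnable parameter."""
    def __init__(self, offset=0.0):
        self.offset = offset
    def tick(self, true_offset):
        # Succeeds if the learned offset matches the workpiece's actual
        # (disturbed) position within a tolerance.
        return abs(self.offset - true_offset) < 0.05

def adapt(leaf, true_offset, episodes=200, sigma=0.1, lr=0.5):
    """Toy hill-climbing stand-in for the RL layer: perturb the leaf's
    parameter and keep changes that move it closer to success."""
    for _ in range(episodes):
        candidate = leaf.offset + random.gauss(0.0, sigma)
        if abs(candidate - true_offset) < abs(leaf.offset - true_offset):
            leaf.offset += lr * (candidate - leaf.offset)
    return leaf

# Workpiece shifted by 0.3 relative to the nominal task description.
leaf = adapt(MoveTo(0.0), true_offset=0.3)
```

The point of the design is that only leaf parameters change; the tree's structure, and hence its interpretability and modularity, stays intact across task variations.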


Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation

Dennler, Nathaniel, Nikolaidis, Stefanos, Matarić, Maja

arXiv.org Artificial Intelligence

People have a variety of preferences for how robots behave. To understand and reason about these preferences, robots aim to learn a reward function that describes how aligned robot behaviors are with a user's preferences. Good representations of a robot's behavior can significantly reduce the time and effort required for a user to teach the robot their preferences. Specifying these representations -- what "features" of the robot's behavior matter to users -- remains a difficult problem; features learned from raw data lack semantic meaning, and features learned from user data require users to engage in tedious labeling processes. Our key insight is that users tasked with customizing a robot are intrinsically motivated to produce labels through exploratory search; they explore behaviors that they find interesting and ignore behaviors that are irrelevant. To harness this novel data source of exploratory actions, we propose contrastive learning from exploratory actions (CLEA) to learn trajectory features that are aligned with features that users care about. We learned CLEA features from exploratory actions users performed in an open-ended signal design activity (N=25) with a Kuri robot, and evaluated CLEA features through a second user study with a different set of users (N=42). CLEA features outperformed self-supervised features when eliciting user preferences over four metrics: completeness, simplicity, minimality, and explainability.
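The contrastive idea behind CLEA — treat behaviors a user chose to explore as positives and ignored ones as negatives — can be sketched with a standard InfoNCE-style objective. This is a generic illustration, not CLEA's actual loss: the embedding dimension, temperature, and cosine similarity are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Contrastive objective: pull the explored (positive) trajectory's
    embedding toward the anchor, push ignored (negative) ones away."""
    def sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # low when the positive wins

# Toy embeddings: the explored trajectory resembles the anchor,
# the ignored ones are unrelated.
anchor = rng.normal(size=8)
positive = anchor + rng.normal(scale=0.1, size=8)
negatives = [rng.normal(size=8) for _ in range(5)]
loss = info_nce(anchor, positive, negatives)
```

Minimizing such a loss over many (explored, ignored) pairs yields trajectory features organized around what users found interesting, without any explicit labeling step.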